changed default value of zeroing-threshold in BackpropTruncationCompo… by freewym · Pull Request #1240 · kaldi-asr/kaldi

freewym · 2016-12-02T04:42:35Z

…nent to 15; updated the results on AMI

danpovey · 2016-12-02T04:51:41Z

src/nnet3/nnet-general-component.cc

-  BaseFloat clipping_threshold = 15.0;
-  BaseFloat zeroing_threshold = 2.0;
+  BaseFloat clipping_threshold = 30.0;
+  BaseFloat zeroing_threshold = 15.0;


Larger values of these quantities are more dangerous, i.e. more likely to lead to instability.
I don't think it's sufficient to just test this on one setup, because it's the potential for divergence that this is supposed to guard against. Have you done any other tests?
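
To make the trade-off concrete, here is a minimal Python sketch of what clipping and zeroing do to a single row of the derivative matrix. It assumes both thresholds are compared against the row's L2 norm, and it ignores which frames the real component actually checks for zeroing. The helper names `row_norm`, `clip_row` and `zero_row` are hypothetical and are not Kaldi's API.

```python
import math

def row_norm(row):
    """L2 norm of one gradient row (one frame's derivative)."""
    return math.sqrt(sum(x * x for x in row))

def clip_row(row, clipping_threshold):
    """Scale the row down to the threshold norm if its norm exceeds it."""
    norm = row_norm(row)
    if norm <= clipping_threshold:
        return list(row)
    scale = clipping_threshold / norm
    return [x * scale for x in row]

def zero_row(row, zeroing_threshold):
    """Replace the row with zeros if its norm exceeds the threshold."""
    if row_norm(row) > zeroing_threshold:
        return [0.0] * len(row)
    return list(row)

big = [12.0, 16.0]   # norm 20
small = [6.0, 8.0]   # norm 10

print(clip_row(big, 15.0))    # old clipping default: row is scaled down
print(clip_row(big, 30.0))    # new clipping default: row passes unchanged
print(zero_row(small, 2.0))   # old zeroing default: row is zeroed
print(zero_row(small, 15.0))  # new zeroing default: row passes unchanged
```

With the larger defaults, both example rows reach the earlier layers untouched, which is the sense in which larger thresholds offer less protection against divergence.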

danpovey · 2016-12-02T04:58:02Z

Also, this PR would need to be on top of the 'fast_lstm' branch, since there are other LSTM config-generation objects there that would have to be changed. But I want you to test it in that setup. And I'd be more comfortable with smaller thresholds, like 5 and 15 or 5 and 20, instead of 20 and 30, if there is no clear difference in results. It's safer in situations where divergence is a possibility. The WER improvements you had in the RESULTS file were rather unimpressive.
Dan

On Thu, Dec 1, 2016 at 11:42 PM, Yiming Wang wrote:

…nent to 15; updated the results on AMI

Commit Summary
- changed default value of zeroing-threshold in BackpropTruncationComponent to 15; updated the results on AMI

File Changes
- M egs/ami/s5b/RESULTS_ihm (5)
- M egs/ami/s5b/RESULTS_sdm (5)
- M egs/wsj/s5/steps/libs/nnet3/xconfig/lstm.py (16)
- M egs/wsj/s5/steps/nnet3/components.py (4)
- M egs/wsj/s5/steps/nnet3/lstm/make_configs.py (2)
- M src/nnet3/nnet-general-component.cc (4)

Patch Links:
- https://github.com/kaldi-asr/kaldi/pull/1240.patch
- https://github.com/kaldi-asr/kaldi/pull/1240.diff

freewym · 2016-12-02T05:41:18Z

This PR is already on top of fast_lstm.

The old WERs reported in RESULTS were obtained without zeroing (i.e. using ClipGradientComponent, as the comment said). The results of tuning zeroing-threshold on ihm are:

threshold   WER dev|eval
4.0         22.7|22.8
5.0         22.4|22.6
6.0         22.4|22.7
10.0        22.5|22.6
15.0        22.4|22.4

I have not tuned it on sdm1.

After the max-deriv-time fix, gradient explosion did not happen even on the Babel Georgian multicondition data (which had the most severe problem before the fix): when I disabled zeroing, the clipped-proportion was at most ~0.004.
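
For readers unfamiliar with this kind of statistic, a proportion like the one quoted above can be thought of as the fraction of derivative rows whose norm exceeds a threshold. The sketch below assumes exactly that definition (per-row L2 norm, strict comparison). The function name `exceed_proportion` is hypothetical, and the example rows are made-up numbers, not values from any of these experiments.

```python
import math

def exceed_proportion(deriv_rows, threshold):
    """Fraction of rows whose L2 norm is strictly above `threshold`."""
    if not deriv_rows:
        return 0.0
    num_exceeding = sum(
        1 for row in deriv_rows
        if math.sqrt(sum(x * x for x in row)) > threshold
    )
    return num_exceeding / len(deriv_rows)

# Made-up rows with norms of roughly 2.2, 20, 0.7 and 50.
rows = [[1.0, 2.0], [12.0, 16.0], [0.5, 0.5], [30.0, 40.0]]
print(exceed_proportion(rows, 15.0))  # two of the four rows exceed 15
print(exceed_proportion(rows, 30.0))  # one of the four rows exceeds 30
```

A value near 0.004 would then mean that only about 4 rows in 1000 were affected.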

I also tuned the zeroing-threshold on swbd blstm_6i:

threshold   WER fg|tg
5.0         14.5|15.8
6.0         14.4|15.8
10.0        14.4|15.8
15.0        14.2|15.6

I chose 15.0 as the threshold rather than 5 or 10 mainly because of the WER on swbd. I am not sure how much run-to-run variation there is with the same settings.
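
The comparison behind this choice can be reproduced with a few lines of Python. The numbers are copied from the two WER tables in this comment. Ranking by the mean of the two columns is an assumption made for illustration, and `best_threshold` and `wer_spread` are hypothetical helper names.

```python
# threshold -> (first WER column, second WER column), copied from the tables.
AMI_IHM = {4.0: (22.7, 22.8), 5.0: (22.4, 22.6), 6.0: (22.4, 22.7),
           10.0: (22.5, 22.6), 15.0: (22.4, 22.4)}   # dev | eval
SWBD = {5.0: (14.5, 15.8), 6.0: (14.4, 15.8),
        10.0: (14.4, 15.8), 15.0: (14.2, 15.6)}      # fg | tg

def best_threshold(results):
    """Threshold with the lowest mean WER; ties go to the smaller threshold."""
    return min(sorted(results), key=lambda t: sum(results[t]) / len(results[t]))

def wer_spread(results):
    """Per-column difference between the worst and best WER, in absolute points."""
    columns = zip(*results.values())
    return tuple(round(max(col) - min(col), 1) for col in columns)

for name, table in (("ami ihm", AMI_IHM), ("swbd", SWBD)):
    print(name, "best:", best_threshold(table), "spread:", wer_spread(table))
```

The spread across all tested thresholds is a few tenths of a point in every column, which is why the question of run-to-run variation matters here.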

danpovey · 2016-12-02T05:52:35Z

OK I will think about it.

freewym · 2016-12-02T06:05:24Z

FYI, added the zeroed-proportion stats of the 1st layer at the last iteration, which shows how often zeroing was activated:

ami ihm
threshold   zeroed-prop
4.0         0.06
5.0         0.045
6.0         0.035
10.0        0.015
15.0        0.006

swbd
threshold   zeroed-prop
5.0         0.018
6.0         0.014
10.0        0.003
15.0        0.0007
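
As a quick way to read these numbers, the sketch below computes how many times less often zeroing fired at threshold 15.0 than at 5.0. The values are copied from the tables above, and `fold_reduction` is a hypothetical helper name.

```python
# threshold -> zeroed-proportion, copied from the tables above.
ZEROED_PROP = {
    "ami ihm": {4.0: 0.06, 5.0: 0.045, 6.0: 0.035, 10.0: 0.015, 15.0: 0.006},
    "swbd": {5.0: 0.018, 6.0: 0.014, 10.0: 0.003, 15.0: 0.0007},
}

def fold_reduction(props, low, high):
    """How many times less often zeroing fired at threshold `high` than at `low`."""
    return props[low] / props[high]

for name, props in ZEROED_PROP.items():
    ratio = fold_reduction(props, 5.0, 15.0)
    print(f"{name}: zeroing fired about {ratio:.1f}x less often at 15.0 than at 5.0")
```

At 15.0 zeroing is rare in both setups, so it acts mostly as a safeguard rather than a routine part of training.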

danpovey · 2016-12-02T19:40:26Z

OK, I'll merge this.

changed default value of zeroing-threshold in BackpropTruncationCompo…

a567df9

…nent to 15; updated the results on AMI

danpovey reviewed Dec 2, 2016


danpovey merged commit 1cab8bd into kaldi-asr:fast_lstm Dec 2, 2016

freewym deleted the fast_lstm branch December 2, 2016 20:06

danpovey merged 1 commit into kaldi-asr:fast_lstm from freewym:fast_lstm
